1.
bioRxiv ; 2024 Feb 08.
Article in English | MEDLINE | ID: mdl-38187547

ABSTRACT

The maintenance of stable mating type polymorphisms is a classic example of balancing selection, underlying the nearly ubiquitous 50/50 sex ratio in species with separate sexes. One lesser known but intriguing example of a balanced mating polymorphism in angiosperms is heterodichogamy - polymorphism for opposing directions of dichogamy (temporal separation of male and female function in hermaphrodites) within a flowering season. This mating system is common throughout Juglandaceae, the family that includes globally important and iconic nut and timber crops - walnuts (Juglans), as well as pecan and other hickories (Carya). In both genera, heterodichogamy is controlled by a single dominant allele. We fine-map the locus in each genus, and find two ancient (>50 Mya) structural variants involving different genes that both segregate as genus-wide trans-species polymorphisms. The Juglans locus maps to a ca. 20 kb structural variant adjacent to a probable trehalose phosphate phosphatase (TPPD-1), homologs of which regulate floral development in model systems. TPPD-1 is differentially expressed between morphs in developing male flowers, with increased allele-specific expression of the dominant haplotype copy. Across species, the dominant haplotype contains a tandem array of duplicated sequence motifs, part of which is an inverted copy of the TPPD-1 3' UTR. These repeats generate various distinct small RNAs matching sequences within the 3' UTR and further downstream. In contrast to the single-gene Juglans locus, the Carya heterodichogamy locus maps to a ca. 200-450 kb cluster of tightly linked polymorphisms across 20 genes, some of which have known roles in flowering and are differentially expressed between morphs in developing flowers. 
The dominant haplotype in pecan, which is nearly always heterozygous and appears to rarely recombine, shows markedly reduced genetic diversity and is over twice as long as its recessive counterpart due to accumulation of various types of transposable elements. We did not detect either genetic system in other heterodichogamous genera within Juglandaceae, suggesting that additional genetic systems for heterodichogamy may yet remain undiscovered.
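
The single-locus, dominant-allele control described above can be illustrated with a small cross table. This is a minimal sketch, assuming that most matings occur between the two morphs, so that a heterozygous dominant-morph parent is crossed with a homozygous recessive one. The allele symbols `G` and `g` and the function name are hypothetical and do not come from the abstract.

```python
from collections import Counter
from itertools import product


def offspring_ratios(parent1: str, parent2: str) -> dict[str, float]:
    """Expected genotype frequencies from a single-locus cross.

    Each parent is a two-letter genotype such as "Gg"; every allele of
    one parent is paired with every allele of the other.
    """
    counts = Counter(
        "".join(sorted(a + b)) for a, b in product(parent1, parent2)
    )
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


# Hypothetical symbols: "G" is the dominant allele, "g" the recessive one.
print(offspring_ratios("Gg", "gg"))
```

Under that assumption the cross returns the two genotypes in equal proportion, which is the single-locus analogue of the 50/50 ratio mentioned at the start of the abstract.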

2.
Plant Dis ; 106(6): 1639-1644, 2022 Jun.
Article in English | MEDLINE | ID: mdl-35512301

ABSTRACT

Sugar pine, Pinus lambertiana Douglas, is a keystone species of montane forests from Baja California to southern Oregon. Like other North American white pines, populations of sugar pine have been greatly reduced by the disease white pine blister rust (WPBR) caused by a fungal pathogen, Cronartium ribicola, that was introduced into North America early in the twentieth century. Major gene resistance to WPBR segregating in natural populations has been documented in sugar pine. Indeed, the dominant resistance gene in this species, Cr1, was genetically mapped, although not precisely. Genomic single nucleotide polymorphisms (SNPs) placed in a large scaffold were reported to be associated with the allele for this major gene resistance (Cr1R). Forest restoration efforts often include sugar pine seed derived from the rare resistant individuals (typically Cr1R/Cr1r) identified through an expensive 2-year phenotypic testing program. To validate and geographically characterize the variation in this association and investigate its potential to expedite genetic improvement in forest restoration, we developed a simple PCR-based diploid genotyping assay for DNA from needle tissue. By applying this to range-wide samples of susceptible and resistant (Cr1R) trees, we show that the SNPs exhibit a strong, though not complete, association with Cr1R. Paralleling earlier studies of the geographic distribution of Cr1R and the inferred demographic history of sugar pine, the resistance-associated SNPs are marginally more common in southern populations, as is Cr1R itself. Although the strength of the association of the SNPs with Cr1R, and thus their predictive value, also varies with geography, the potential value of this new tool in quickly and efficiently identifying candidate WPBR-resistant seed trees is clear.


Subjects
Pinus, Basidiomycota, Genomics, Mexico, Pinus/genetics, Pinus/microbiology, Single Nucleotide Polymorphism/genetics, Sugars
3.
Viruses ; 13(8)2021 07 24.
Article in English | MEDLINE | ID: mdl-34452308

ABSTRACT

Viruses are considered of major importance in strawberry (Fragaria × ananassa Duchesne) production given their negative impact on plant vigor and growth. Strawberry accessions from the National Clonal Germplasm Repository were screened for viruses using high-throughput sequencing (HTS). Analyses of sequence information from 45 plants identified multiple variants of 14 known viruses, comprising strawberry mottle virus (SMoV), beet pseudo yellows virus (BPYV), strawberry pallidosis-associated virus (SPaV), tomato ringspot virus (ToRSV), strawberry mild yellow edge virus (SMYEV), strawberry vein banding virus (SVBV), strawberry crinkle virus (SCV), strawberry polerovirus 1 (SPV-1), apple mosaic virus (ApMV), strawberry chlorotic fleck virus (SCFaV), strawberry crinivirus 4 (SCrV-4), strawberry crinivirus 3 (SCrV-3), Fragaria chiloensis latent virus (FClLV) and Fragaria chiloensis cryptic virus (FCCV). Genetic diversity of sequenced virus isolates was investigated via sequence homology analysis, and partial-genome sequences were deposited into GenBank. To confirm the HTS results and expand the detection of strawberry viruses, new reverse transcription quantitative PCR (RT-qPCR) assays were designed for the above-listed viruses. Further in silico and in vitro validation of the new diagnostic assays indicated high efficiency and reliability. Thus, the occurrence of different viruses, including divergent variants, among the strawberries was verified. This is the first viral metagenomic survey in strawberry. Additionally, this study describes the design and validation of multiple RT-qPCR assays for strawberry viruses, which represent important detection tools for clean plant programs.


Subjects
Fragaria/virology, Genetic Variation, High-Throughput Nucleotide Sequencing, Plant Diseases/virology, RNA Viruses/genetics, Reverse Transcriptase Polymerase Chain Reaction/methods, Reverse Transcriptase Polymerase Chain Reaction/standards, Chromosome Mapping, Viral Genome, Metagenomics, Phylogeny, RNA Viruses/classification, Reproducibility of Results
4.
Viruses ; 13(6)2021 06 11.
Article in English | MEDLINE | ID: mdl-34208336

ABSTRACT

Development of High-Throughput Sequencing (HTS), also known as next generation sequencing, revolutionized diagnostic research of plant viruses. HTS outperforms bioassays and molecular diagnostic assays that are used to screen domestic and quarantine grapevine materials in data throughput, cost, scalability, and detection of novel and highly variant virus species. However, before HTS-based assays can be routinely used for plant virus diagnostics, performance specifications need to be developed and assessed. In this study, we selected 18 virus-infected grapevines as a test panel for measuring performance characteristics of an HTS-based diagnostic assay. Total nucleic acid (TNA) was extracted from petioles and dormant canes of individual samples, and the constructed libraries were run on an Illumina NextSeq 500 instrument using a 75-bp single-end read platform. Sensitivity was 98% measured over 264 distinct virus and viroid infections with a false discovery rate (FDR) of approximately 1 in 5 positives. The results also showed that combining a spring petiole test with a fall cane test increased sensitivity to 100% for this TNA HTS assay. To evaluate extraction methodology, these results were compared to parallel dsRNA extractions. In addition, in a more detailed dilution study, the TNA HTS assay described here consistently performed well down to a dilution of 5%. In that range, sensitivity was 98% with a corresponding FDR of approximately 1 in 5. Repeatability and reproducibility were assessed at 99% and 93%, respectively. The protocol, criteria, and performance levels described here may help to standardize HTS for quality assurance and accreditation purposes in plant quarantine or certification programs.
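
The two performance measures reported above, sensitivity and false discovery rate, follow from simple counts of true positives, false negatives and false positives. The sketch below shows the standard definitions only. The counts are hypothetical values chosen to illustrate the arithmetic and are not taken from the study.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of real infections that the assay detected."""
    return true_positives / (true_positives + false_negatives)


def false_discovery_rate(true_positives: int, false_positives: int) -> float:
    """Fraction of positive calls that were not real infections."""
    return false_positives / (true_positives + false_positives)


# Hypothetical counts, for illustration only (not data from the abstract).
tp, fn, fp = 90, 10, 30
print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"FDR = {false_discovery_rate(tp, fp):.2f}")
```

An FDR of "approximately 1 in 5 positives" corresponds to a value near 0.2 under this definition.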


Subjects
High-Throughput Nucleotide Sequencing, Plant Diseases/virology, Plant Viruses/genetics, Vitis/virology, High-Throughput Nucleotide Sequencing/methods, High-Throughput Nucleotide Sequencing/standards, Molecular Diagnostic Techniques/methods, Molecular Diagnostic Techniques/standards, Viral RNA, Reproducibility of Results, Sensitivity and Specificity
5.
Plant J ; 102(2): 410-423, 2020 04.
Article in English | MEDLINE | ID: mdl-31823432

ABSTRACT

Juglans (walnuts), the most speciose genus in the walnut family (Juglandaceae), represents most of the family's commercially valuable fruit and wood-producing trees. It includes several species used as rootstock for their resistance to various abiotic and biotic stressors. We present the full structural and functional genome annotations of six Juglans species and one outgroup within Juglandaceae (Juglans regia, J. cathayensis, J. hindsii, J. microcarpa, J. nigra, J. sigillata and Pterocarya stenoptera) produced using the BRAKER2 semi-unsupervised gene prediction pipeline and additional tools. For each annotation, gene predictors were trained using 19 tissue-specific J. regia transcriptomes aligned to the genomes. Additional functional evidence and filters were applied to multi-exonic and mono-exonic putative genes to yield between 27 000 and 44 000 high-confidence gene models per species. Comparison of gene models to the BUSCO embryophyta dataset suggested that, on average, genome annotation completeness was 85.6%. We utilized these high-quality annotations to assess gene family evolution within Juglans, and among Juglans and selected Eurosid species. We found notable contractions in several gene families in J. hindsii, including disease resistance-related wall-associated kinase (WAK), Catharanthus roseus receptor-like kinase (CrRLK1L) and others involved in abiotic stress response. Finally, we confirmed an ancient whole-genome duplication that took place in a common ancestor of Juglandaceae using site substitution comparative analysis.


Subjects
Plant Genome/genetics, Genomics, Juglans/genetics, Transcriptome, Disease Resistance/genetics, Juglans/physiology, Physiological Stress
6.
Plant Biotechnol J ; 17(6): 1027-1036, 2019 06.
Article in English | MEDLINE | ID: mdl-30515952

ABSTRACT

Over the last 20 years, global production of Persian walnut (Juglans regia L.) has grown enormously, likely reflecting increased consumption due to its numerous benefits to human health. However, advances in genome-wide association (GWA) studies and genomic selection (GS) for agronomically important traits in walnut remain limited due to the lack of powerful genomic tools. Here, we present the development and validation of a high-density 700K single nucleotide polymorphism (SNP) array in Persian walnut. Over 609K high-quality SNPs have been thoroughly selected from a set of 9.6 million genome-wide variants, previously identified from the high-depth re-sequencing of 27 founders of the Walnut Improvement Program (WIP) of the University of California, Davis. To validate the effectiveness of the array, we genotyped a collection of 1284 walnut trees, including 1167 progeny of 48 WIP families and 26 walnut cultivars. More than half of the SNPs (55.7%) fell in the highest quality class of 'Poly High Resolution' (PHR) polymorphisms, which were used to assess the WIP pedigree integrity. We identified 151 new parent-offspring relationships, all confirmed with the Mendelian inheritance test. In addition, we explored the genetic variability among cultivars of different origin, revealing how the varieties from Europe and California were differentiated from Asian accessions. Both the reconstruction of the WIP pedigree and population structure analysis confirmed the effectiveness of the Applied Biosystems™ Axiom™ J. regia 700K SNP array, which initiates a novel genomic and advanced phase in walnut genetics and breeding.
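
The abstract does not describe how the Mendelian inheritance test was implemented. As a rough sketch of the general idea, a proposed trio is consistent at a SNP when the offspring genotype can be formed from one allele of each parent, and the share of inconsistent SNPs can then be used to accept or reject the relationship. The function names, genotype encoding and example genotypes below are hypothetical.

```python
from itertools import product

Genotype = tuple[str, str]


def is_mendelian_consistent(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """True if the child could inherit one allele from each parent at this SNP."""
    target = tuple(sorted(child))
    return any(tuple(sorted((m, f))) == target for m, f in product(mother, father))


def mendelian_error_rate(trios: list[tuple[Genotype, Genotype, Genotype]]) -> float:
    """Fraction of SNPs at which a (child, mother, father) trio is inconsistent."""
    errors = sum(not is_mendelian_consistent(c, m, f) for c, m, f in trios)
    return errors / len(trios)


# Hypothetical genotypes at three SNPs for one proposed trio.
snps = [
    (("A", "G"), ("A", "A"), ("G", "G")),  # consistent
    (("A", "A"), ("A", "G"), ("A", "G")),  # consistent
    (("G", "G"), ("A", "A"), ("A", "G")),  # inconsistent: mother cannot give G
]
print(mendelian_error_rate(snps))
```

In practice a small nonzero error rate is usually tolerated to allow for genotyping errors; the threshold is a design choice that the abstract does not specify.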


Subjects
Genomics, Genotyping Techniques, Juglans, Genome-Wide Association Study, Genomics/methods, Genotype, Genotyping Techniques/instrumentation, Humans, Juglans/genetics, Single Nucleotide Polymorphism/genetics
7.
G3 (Bethesda) ; 8(7): 2153-2165, 2018 07 02.
Article in English | MEDLINE | ID: mdl-29792315

ABSTRACT

Genomic analysis in Juglans (walnuts) is expected to transform the breeding and agricultural production of both nuts and lumber. To that end, we report here the determination of reference sequences for six additional relatives of Juglans regia: Juglans sigillata (also from section Dioscaryon), Juglans nigra, Juglans microcarpa, Juglans hindsii (from section Rhysocaryon), Juglans cathayensis (from section Cardiocaryon), and the closely related Pterocarya stenoptera. While these are 'draft' genomes, ranging in size between 640 Mbp and 990 Mbp, their contiguities and accuracies can support powerful annotations of genomic variation that are often the foundation of new avenues of research and breeding. We annotated nucleotide divergence and synteny by creating complete pairwise alignments of each reference genome to the remaining six. In addition, we have re-sequenced a sample of accessions from four Juglans species (including regia). The variation discovered in these surveys comprises a critical resource for experimentation and breeding, as well as a solid complementary annotation. To demonstrate the potential of these resources, the structural and sequence variation in and around the polyphenol oxidase loci, PPO1 and PPO2, was investigated. As reported for other seed crops, variation in this gene is implicated in the domestication of walnuts. The apparently Juglandaceae-specific PPO1 duplicate shows accelerated divergence and an excess of amino acid replacement on the lineage leading to accessions of the domesticated nut crop species, Juglans regia and sigillata.


Subjects
Genetic Variation, Plant Genome, Genomics, Juglans/classification, Juglans/genetics, Computational Biology/methods, Molecular Evolution, Genome Size, Genomics/methods, Genotype, High-Throughput Nucleotide Sequencing, Molecular Sequence Annotation, Phylogeny, Single Nucleotide Polymorphism
8.
Gigascience ; 6(10): 1, 2017 10 01.
Article in English | MEDLINE | ID: mdl-29020755

ABSTRACT

The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361 bp, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821 bp, 61% larger than the previous assembly.
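
The N50 statistic quoted above is a length-weighted summary of contiguity: it is the length L such that contigs (or scaffolds) of length L or longer together contain at least half of the assembled bases. The function below is a minimal sketch of that conventional definition; the example lengths are hypothetical and are not from the P. taeda assembly.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that sequences of length >= L hold at least half the total."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical contig lengths, for illustration only.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

Because the statistic is weighted by length, a few long contigs can raise N50 considerably even when most contigs remain short.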

9.
G3 (Bethesda) ; 7(9): 3157-3167, 2017 09 07.
Article in English | MEDLINE | ID: mdl-28751502

ABSTRACT

A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceed those of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp). Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.


Subjects
Plant Genome, Photosynthesis/genetics, Pinaceae/genetics, Pinaceae/metabolism, Pseudotsuga/genetics, Pseudotsuga/metabolism, Whole Genome Sequencing, Biological Adaptation/genetics, Computational Biology, Molecular Evolution, Gene Duplication, Gene Regulatory Networks, Genomics, Molecular Sequence Annotation, Multigene Family, Phylogeny, Pinaceae/classification, Proteomics/methods, Pseudotsuga/classification, Nucleic Acid Repetitive Sequences
10.
Gigascience ; 6(1): 1-4, 2017 01 01.
Article in English | MEDLINE | ID: mdl-28369353

ABSTRACT

The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361 bp, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821 bp, 61% larger than the previous assembly.


Subjects
Contig Mapping, Plant Genome, High-Throughput Nucleotide Sequencing, Pinus taeda/genetics, DNA Sequence Analysis, Algorithms, Genomics
11.
G3 (Bethesda) ; 7(5): 1563-1568, 2017 05 05.
Article in English | MEDLINE | ID: mdl-28341701

ABSTRACT

We investigate the utility and scalability of new read cloud technologies to improve the draft genome assemblies of the colossal, and largely repetitive, genomes of conifers. Synthetic long read technologies have existed in various forms as a means of reducing complexity and resolving repeats since the outset of genome assembly. Recently, technologies that combine subhaploid pools of high molecular weight DNA with barcoding on a massive scale have brought new efficiencies to sample preparation and data generation. When combined with inexpensive light shotgun sequencing, the resulting data can be used to scaffold large genomes. The protocol is efficient enough to consider routinely for even the largest genomes. Conifers represent the largest reference genome projects executed to date. The largest of these is that of the conifer Pinus lambertiana (sugar pine), with a genome size of 31 billion bp. In this paper, we report on the molecular and computational protocols for scaffolding the P. lambertiana genome using the library technology from 10× Genomics. At 247,000 bp, the NG50 of the existing reference sequence is the highest scaffold contiguity among the currently published conifer assemblies; this new assembly's NG50 is 1.94 million bp, an eightfold increase.
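
NG50, the contiguity measure used above, is computed against an estimated genome size rather than against the total length of the assembly, which makes values comparable between assemblies of different completeness. The sketch below assumes that conventional definition; the scaffold lengths and genome size are hypothetical and far smaller than any conifer assembly.

```python
from typing import Optional


def ng50(scaffold_lengths: list[int], genome_size: int) -> Optional[int]:
    """Length L such that scaffolds of length >= L cover half the estimated genome.

    Returns None when the assembly spans less than half of genome_size.
    """
    running = 0
    for length in sorted(scaffold_lengths, reverse=True):
        running += length
        if running >= genome_size / 2:
            return length
    return None


# Hypothetical values, for illustration only.
scaffolds = [50, 40, 30, 20, 10]   # assembly total: 150
print(ng50(scaffolds, genome_size=200))
```

With these values the result is 30, whereas the same scaffolds measured against their own total length would give a larger figure, so the two statistics should not be compared directly.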


Subjects
Contig Mapping/methods, Plant Genome, Pinus/genetics, Plant Extracts/genetics, Whole Genome Sequencing/methods, Algorithms, Balsams, Contig Mapping/standards, Reference Standards, Whole Genome Sequencing/standards
12.
G3 (Bethesda) ; 6(12): 3787-3802, 2016 12 07.
Article in English | MEDLINE | ID: mdl-27799338

ABSTRACT

Sugar pine (Pinus lambertiana Douglas) is within the subgenus Strobus with an estimated genome size of 31 Gbp. Transcriptomic resources are of particular interest in conifers due to the challenges presented in their megagenomes for gene identification. In this study, we present the first comprehensive survey of the P. lambertiana transcriptome through deep sequencing of a variety of tissue types to generate more than 2.5 billion short reads. Third generation, long reads generated through PacBio Iso-Seq have been included for the first time in conifers to combat the challenges associated with de novo transcriptome assembly. A technology comparison is provided here to contribute to the otherwise scarce comparisons of second and third generation transcriptome sequencing approaches in plant species. In addition, the transcriptome reference was essential for gene model identification and quality assessment in the parallel project responsible for sequencing and assembly of the entire genome. In this study, the transcriptomic data were also used to address questions surrounding lineage-specific Dicer-like proteins in conifers. These proteins play a role in the control of transposable element proliferation and the related genome expansion in conifers.


Subjects
Plant Genes, Plant Genome, Genomics, Pinus/genetics, Computational Biology/methods, Gene Expression Profiling, Plant Gene Expression Regulation, Genetic Variation, Genomics/methods, High-Throughput Nucleotide Sequencing, MicroRNAs/genetics, Molecular Sequence Annotation, Multigene Family, Ribonuclease III/genetics, Transcriptome
13.
Genetics ; 204(4): 1613-1626, 2016 Dec.
Article in English | MEDLINE | ID: mdl-27794028

ABSTRACT

Until very recently, complete characterization of the megagenomes of conifers has remained elusive. Sugar pine (Pinus lambertiana Dougl.) has a highly repetitive, diploid genome of 31 billion bp. It is the largest genome sequenced and assembled to date, and the first from the subgenus Strobus, or white pines, a group that is notable for having the largest genomes among the pines. The genome represents a unique opportunity to investigate genome "obesity" in conifers and white pines. Comparative analysis of P. lambertiana and P. taeda L. reveals new insights on the conservation, age, and diversity of the highly abundant transposable elements, the primary factor determining genome size. As with most North American white pines, the principal pathogen of P. lambertiana is white pine blister rust (Cronartium ribicola J.C. Fischer ex Raben.). Identification of candidate genes for resistance to this pathogen is of great ecological importance. The genome sequence afforded us the opportunity to make substantial progress on locating the major dominant gene for simple resistance hypersensitive response, Cr1. We describe new markers and gene annotation that are both tightly linked to Cr1 in a mapping population, and associated with Cr1 in unrelated sugar pine individuals sampled throughout the species' range, creating a solid foundation for future mapping. This genomic variation and annotated candidate genes characterized in our study of the Cr1 region are resources for future marker-assisted breeding efforts as well as for investigations of fundamental mechanisms of invasive disease and evolutionary response.


Subjects
Plant Genome, Pinus/genetics, Basidiomycota/pathogenicity, DNA Transposable Elements, Genetic Variation, Genome Size, Pinus/immunology, Pinus/microbiology, Plant Immunity/genetics
14.
Plant J ; 87(5): 507-32, 2016 09.
Article in English | MEDLINE | ID: mdl-27145194

ABSTRACT

The Persian walnut (Juglans regia L.), a diploid species native to the mountainous regions of Central Asia, is the major walnut species cultivated for nut production and is one of the most widespread tree nut species in the world. The high nutritional value of J. regia nuts is associated with a rich array of polyphenolic compounds, whose complete biosynthetic pathways are still unknown. A J. regia genome sequence was obtained from the cultivar 'Chandler' to discover target genes and additional unknown genes. The 667-Mbp genome was assembled using two different methods (SOAPdenovo2 and MaSuRCA), with an N50 scaffold size of 464 955 bp (based on a genome size of 606 Mbp), 221 640 contigs and a GC content of 37%. Annotation with MAKER-P and other genomic resources yielded 32 498 gene models. Previous studies in walnut relying on tissue-specific methods have only identified a single polyphenol oxidase (PPO) gene (JrPPO1). Enabled by the J. regia genome sequence, a second homolog of PPO (JrPPO2) was discovered. In addition, about 130 genes in the large gallate 1-β-glucosyltransferase (GGT) superfamily were detected. Specifically, two genes, JrGGT1 and JrGGT2, were significantly homologous to the GGT from Quercus robur (QrGGT), which is involved in the synthesis of 1-O-galloyl-β-D-glucose, a precursor for the synthesis of hydrolysable tannins. The reference genome for J. regia provides meaningful insight into the complex pathways required for the synthesis of polyphenols. The walnut genome sequence provides important tools and methods to accelerate breeding and to facilitate the genetic dissection of complex traits.


Subjects
Plant Genome/genetics, Juglans/genetics, Plant Proteins/genetics, Polyphenols/metabolism, Catechol Oxidase/metabolism
15.
Genetics ; 199(4): 1229-41, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25631317

ABSTRACT

Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets.
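
The strategy above builds an updated reference from first-round substitutions and short indels before mapping a second time. The sketch below illustrates only that reference-update step, under stated assumptions: variants are given as hypothetical `(position, ref_allele, alt_allele)` tuples with 0-based positions, they do not overlap, and the alignment programs themselves are not modeled.

```python
Variant = tuple[int, str, str]  # (0-based position, reference allele, alternate allele)


def apply_variants(reference: str, variants: list[Variant]) -> str:
    """Return a copy of the reference with substitutions and short indels applied.

    Variants are applied from right to left so that earlier coordinates
    are not shifted by insertions or deletions further along the sequence.
    """
    seq = reference
    for pos, ref, alt in sorted(variants, key=lambda v: v[0], reverse=True):
        if seq[pos:pos + len(ref)] != ref:
            raise ValueError(f"reference allele mismatch at position {pos}")
        seq = seq[:pos] + alt + seq[pos + len(ref):]
    return seq


# Hypothetical toy reference and first-round variant calls.
reference = "ACGTACGTAC"
calls = [
    (2, "G", "T"),     # substitution
    (5, "CG", "C"),    # 1-bp deletion
    (8, "A", "AGG"),   # 2-bp insertion
]
print(apply_variants(reference, calls))
```

Reads would then be remapped to the returned sequence, and final variant calls would need to be translated back to the original coordinate system, a bookkeeping step this sketch omits.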


Subjects
Nucleic Acid Databases, Drosophila melanogaster/genetics, Insect Genome, Genetic Polymorphism, Animals, Base Sequence, Contig Mapping, Molecular Sequence Data, Sequence Alignment
16.
Genetics ; 196(3): 875-90, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24653210

ABSTRACT

Conifers are the predominant gymnosperm. The size and complexity of their genomes have presented formidable technical challenges for whole-genome shotgun sequencing and assembly. We employed novel strategies that allowed us to determine the loblolly pine (Pinus taeda) reference genome sequence, the largest genome assembled to date. Most of the sequence data were derived from whole-genome shotgun sequencing of a single megagametophyte, the haploid tissue of a single pine seed. Although that constrained the quantity of available DNA, the resulting haploid sequence data were well-suited for assembly. The haploid sequence was augmented with multiple linking long-fragment mate pair libraries from the parental diploid DNA. For the longest fragments, we used novel fosmid DiTag libraries. Sequences from the linking libraries that did not match the megagametophyte were identified and removed. Assembly of the sequence data was aided by condensing the enormous number of paired-end reads into a much smaller set of longer "super-reads," rendering subsequent assembly with an overlap-based assembly algorithm computationally feasible. To further improve the contiguity and biological utility of the genome sequence, additional scaffolding methods utilizing independent genome and transcriptome assemblies were implemented. The combination of these strategies resulted in a draft genome sequence of 20.15 billion bases, with an N50 scaffold size of 66.9 kbp.


Subjects
Plant Genome, Ovule/genetics, Pinus taeda/genetics, Genomics, Haploidy, DNA Sequence Analysis, Transcriptome
17.
Genetics ; 196(3): 891-909, 2014 Mar.
Article in English | MEDLINE | ID: mdl-24653211

ABSTRACT

The largest genus in the conifer family Pinaceae is Pinus, with over 100 species. The size and complexity of their genomes (∼20-40 Gb, 2n = 24) have delayed the arrival of a well-annotated reference sequence. In this study, we present the annotation of the first whole-genome shotgun assembly of loblolly pine (Pinus taeda L.), which comprises 20.1 Gb of sequence. The MAKER-P annotation pipeline combined evidence-based alignments and ab initio predictions to generate 50,172 gene models, of which 15,653 are classified as high confidence. Clustering these gene models with 13 other plant species resulted in 20,646 gene families, of which 1554 are predicted to be unique to conifers. Among the conifer gene families, 159 are composed exclusively of loblolly pine members. The gene models for loblolly pine have the highest median and mean intron lengths of 24 fully sequenced plant genomes. Conifer genomes are full of repetitive DNA, with the most significant contributions from long-terminal-repeat retrotransposons. In-depth analysis of the tandem and interspersed repetitive content yielded a combined estimate of 82%.


Subjects
Plant Genome, Molecular Sequence Annotation/methods, Pinus taeda/genetics, Plant DNA/analysis, Molecular Evolution, Plant Genes, Multigene Family, Phylogeny, Sequence Alignment
18.
Genome Biol ; 15(3): R59, 2014 Mar 04.
Article in English | MEDLINE | ID: mdl-24647006

ABSTRACT

BACKGROUND: The size and complexity of conifer genomes have, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. RESULTS: We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. CONCLUSIONS: In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrates a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied.


Subjects
Contig Mapping/methods , Genome, Plant , Pinus taeda/genetics , Sequence Analysis, DNA/methods , DNA, Plant/genetics , Haploidy
19.
PLoS One ; 8(9): e72439, 2013.
Article in English | MEDLINE | ID: mdl-24023741

ABSTRACT

Despite their prevalence and importance, the genome sequences of loblolly pine, Norway spruce, and white spruce, three ecologically and economically important conifer species, are just becoming available to the research community. Following the completion of these large assemblies, annotation efforts will be undertaken to characterize the reference sequences. Accurate annotation of these ancient genomes would be aided by a comprehensive repeat library; however, few studies have generated enough sequence to fully evaluate and catalog their non-genic content. In this paper, two sets of loblolly pine genomic sequence, 103 previously assembled BACs and 90,954 newly sequenced and assembled fosmid scaffolds, were analyzed. Together, this sequence represents 280 Mbp (roughly 1% of the loblolly pine genome) and one of the most comprehensive studies of repetitive elements and genes in a gymnosperm species. A combination of homology and de novo methodologies was applied to identify both conserved and novel repeats. Similarity analysis estimated a repetitive content of 27% that included both full and partial elements. When combined with the de novo investigation, the estimate increased to almost 86%. Over 60% of the repetitive sequence consists of full or partial LTR (long terminal repeat) retrotransposons. Through de novo approaches, 6,270 novel, full-length transposable element families and 9,415 sub-families were identified. Among those 6,270 families, 82% were annotated as single-copy. Several of the novel, high-copy families are described here, with the largest, PtPiedmont, comprising 133 full-length copies. In addition to repeats, analysis of the coding region reported 23 full-length eukaryotic orthologous proteins (KOGs) and another 29 novel or orthologous genes. These discoveries, along with other genomic resources, will be used to annotate conifer genomes and address long-standing questions about gymnosperm evolution.


Subjects
Chromosomes, Artificial, Bacterial/genetics , Genome, Plant/genetics , Pinus taeda/genetics , Retroelements/genetics
20.
PLoS Genet ; 8(12): e1003080, 2012.
Article in English | MEDLINE | ID: mdl-23284287

ABSTRACT

Drosophila melanogaster has played a pivotal role in the development of modern population genetics. However, many basic questions regarding the demographic and adaptive history of this species remain unresolved. We report the genome sequencing of 139 wild-derived strains of D. melanogaster, representing 22 population samples from the sub-Saharan ancestral range of this species, along with one European population. Most genomes were sequenced above 25X depth from haploid embryos. Results indicated a pervasive influence of non-African admixture in many African populations, motivating the development and application of a novel admixture detection method. Admixture proportions varied among populations, with greater admixture in urban locations. Admixture levels also varied across the genome, with localized peaks and valleys suggestive of a non-neutral introgression process. Genomes from the same location differed starkly in ancestry, suggesting that isolation mechanisms may exist within African populations. After removing putatively admixed genomic segments, the greatest genetic diversity was observed in southern Africa (e.g. Zambia), while diversity in other populations was largely consistent with a geographic expansion from this potentially ancestral region. The European population showed different levels of diversity reduction on each chromosome arm, and some African populations displayed chromosome arm-specific diversity reductions. Inversions in the European sample were associated with strong elevations in diversity across chromosome arms. Genomic scans were conducted to identify loci that may represent targets of positive selection within an African population, between African populations, and between European and African populations. A disproportionate number of candidate selective sweep regions were located near genes with varied roles in gene regulation. Outliers for Europe-Africa F(ST) were found to be enriched in genomic regions of locally elevated cosmopolitan admixture, possibly reflecting a role for some of these loci in driving the introgression of non-African alleles into African populations.


Subjects
Drosophila melanogaster/genetics , Genetic Variation , Genome, Insect , Metagenomics , Adaptation, Physiological/genetics , Africa South of the Sahara , Alleles , Animals , Base Sequence , Europe , Evolution, Molecular , High-Throughput Nucleotide Sequencing , Selection, Genetic
...